Most pregnancies and births result in a good outcome, but complications are not uncommon, and when they occur, they can be associated with serious consequences for mothers and babies. Predictive modeling has the potential to improve outcomes, and thereby help obstetricians deliver better care, through a better understanding of risk factors, heightened surveillance, and more timely and appropriate interventions. Using Explainable Boosting Machines (EBMs), a glass-box model, for clarity of interpretation, we identify and study the most important risk factors for three types of complications: (i) severe maternal morbidity (SMM), (ii) shoulder dystocia, and (iii) preterm preeclampsia. Our experiments show that EBMs match the accuracy of other black-box ML methods such as deep neural networks and random forests, while using the interpretability of the EBMs to surface surprising insights into the features contributing to risk.
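At its core, an EBM is a glass-box additive model: one learned shape function per feature, trained by cyclic gradient boosting. The following is only a rough illustration of that idea with hypothetical function names, not the authors' implementation:

```python
import numpy as np

def fit_ebm_like(X, y, n_bins=8, n_rounds=200, lr=0.1):
    """Minimal additive-model sketch in the spirit of an Explainable
    Boosting Machine: one binned shape function per feature, learned
    by cyclic gradient boosting on squared error."""
    n, d = X.shape
    # quantile bin edges per feature
    edges = [np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
             for j in range(d)]
    bins = np.stack([np.digitize(X[:, j], edges[j]) for j in range(d)], axis=1)
    intercept = y.mean()
    contrib = [np.zeros(n_bins) for _ in range(d)]   # per-feature shape functions
    pred = np.full(n, intercept)
    for _ in range(n_rounds):
        for j in range(d):                           # cycle over features
            resid = y - pred
            for b in range(n_bins):                  # small step toward bin-mean residual
                mask = bins[:, j] == b
                if mask.any():
                    step = lr * resid[mask].mean()
                    contrib[j][b] += step
                    pred[mask] += step
    return intercept, edges, contrib

def predict_ebm_like(model, X):
    intercept, edges, contrib = model
    out = np.full(len(X), intercept)
    for j in range(len(contrib)):
        out += contrib[j][np.digitize(X[:, j], edges[j])]
    return out
```

The per-feature `contrib` arrays are the "glass-box" part: plotting them against the bin edges shows exactly how each feature moves the predicted risk.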
Many dynamical systems exhibit latent states with intrinsic orderings such as "ally", "neutral" and "enemy" relationships in international relations. Such latent states are evidenced through entities' cooperative versus conflictual interactions which are similarly ordered. Models of such systems often involve state-to-action emission and state-to-state transition matrices. It is common practice to assume that the rows of these stochastic matrices are independently sampled from a Dirichlet distribution. However, this assumption discards ordinal information and treats states and actions falsely as order-invariant categoricals, which hinders interpretation and evaluation. To address this problem, we propose the Ordered Matrix Dirichlet (OMD): rows are sampled in a conditionally dependent manner such that probability mass is shifted to the right of the matrix as we move down its rows. This results in a well-ordered mapping between latent states and observed action types. We evaluate the OMD in two settings: a Hidden Markov Model and a novel Bayesian Dynamic Poisson Tucker Model tailored to political event data. Models built on the OMD recover interpretable latent states and show superior forecasting performance in few-shot settings. We detail the wide applicability of the OMD to other domains where models with Dirichlet-sampled matrices are popular (e.g. topic modeling) and publish user-friendly code.
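The ordering property can be mimicked in a few lines. This sketch draws each row from an independent Dirichlet whose concentration peak slides rightward with the row index; the actual OMD additionally makes the rows conditionally dependent, so this is only an illustration of the qualitative behavior, with all parameter names assumed:

```python
import numpy as np

def ordered_stochastic_matrix(n_states, n_actions, peak_conc=20.0,
                              base_conc=1.0, rng=None):
    """Illustrative sketch (not the paper's exact OMD construction):
    each row is Dirichlet-distributed with a concentration peak that
    moves rightward as the row index grows, so probability mass shifts
    toward later action types for later (higher-ordered) states."""
    rng = np.random.default_rng(rng)
    rows = []
    for i in range(n_states):
        # peak column for this row, swept linearly from 0 to n_actions - 1
        peak = i * (n_actions - 1) / max(n_states - 1, 1)
        conc = base_conc + peak_conc * np.exp(
            -0.5 * (np.arange(n_actions) - peak) ** 2)
        rows.append(rng.dirichlet(conc))
    return np.array(rows)
```

Because each row here is still drawn independently, this only reproduces the ordered mapping between states and action types, not the OMD's row-to-row dependence.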
Deep learning models that leverage large datasets are often the state of the art for modelling molecular properties. When the datasets are smaller (< 2000 molecules), it is not clear that deep learning approaches are the right modelling tool. In this work we perform an extensive study of the calibration and generalizability of probabilistic machine learning models on small chemical datasets. Using different molecular representations and models, we analyse the quality of their predictions and uncertainties in a variety of tasks (binary, regression) and datasets. We also introduce two simulated experiments that evaluate their performance: (1) Bayesian optimization guided molecular design, (2) inference on out-of-distribution data via ablated cluster splits. We offer practical insights into model and feature choice for modelling small chemical datasets, a common scenario in new chemical experiments. We have packaged our analysis into the DIONYSUS repository, which is open sourced to aid in reproducibility and extension to new datasets.
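One calibration check such a study relies on can be sketched as follows: given a model's Gaussian predictive means and standard deviations, compare nominal central-interval levels with their empirical coverage. The function name and default levels are illustrative, not the DIONYSUS API:

```python
import numpy as np
from math import erf, sqrt

def interval_coverage(y, mu, sigma, levels=(0.5, 0.8, 0.9, 0.95)):
    """Sketch of a coverage-based calibration check for probabilistic
    regressors: for each nominal central-interval level, measure the
    empirical fraction of targets inside the predicted Gaussian
    interval. A well-calibrated model gives empirical ~= nominal."""
    def z_for(level):
        # half-width z such that P(|Z| <= z) = level, by bisection
        lo, hi = 0.0, 10.0
        target = 0.5 + level / 2
        for _ in range(60):
            mid = (lo + hi) / 2
            if 0.5 * (1 + erf(mid / sqrt(2))) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    return {lvl: float(np.mean(np.abs(y - mu) <= z_for(lvl) * sigma))
            for lvl in levels}
```

For a perfectly calibrated model the returned dictionary is close to `{0.5: 0.5, 0.8: 0.8, ...}`; systematic deviations indicate over- or under-confident uncertainties.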
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Balancing multiple competing and conflicting objectives is an essential task for any artificial intelligence tasked with satisfying human values or preferences. Conflict arises both from misalignment between individuals with competing values, and from conflicting value systems held within a single individual. Starting from the principle of loss aversion, we designed a set of soft maximin function approaches to multi-objective decision-making. After benchmarking these functions in a set of previously-developed environments, we found that one new approach in particular, "split-function exp-log loss aversion" (SFELLA), learns faster than the state-of-the-art thresholded alignment objective method \cite{vamplew_potential_2021} on three of the four tasks it was tested on, and reaches the same optimal performance after learning. SFELLA also showed relative robustness improvements against changes in objective scale, which may highlight an advantage in settings involving distribution shift in environment dynamics. Further work had to be omitted from the preprint, but in the final published version we will further compare SFELLA with the multi-objective reward exponentials (MORE) approach, showing that SFELLA performs similarly to MORE in a simple, previously-described foraging task, but that in a modified foraging environment with a new resource that is not depleted as the agent works, SFELLA collects more of the new resource at little cost in terms of the old resource. Overall, we find SFELLA useful for avoiding problems that sometimes arise with thresholding approaches, and more reward-responsive than MORE while retaining its conservative, loss-averse incentive structure.
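A generic member of the soft-maximin family can be written as a smoothed minimum over objective rewards. This is a sketch of the kind of aggregation function studied, not the exact SFELLA split-function:

```python
import numpy as np

def soft_maximin(rewards, temperature=1.0):
    """Generic soft-maximin aggregation: a smooth, loss-averse
    alternative to min() over objectives. It weights the
    worst-performing objective most heavily and approaches min()
    as temperature -> 0, while staying differentiable for learning."""
    r = np.asarray(rewards, dtype=float)
    # -t * log(mean(exp(-r / t))) lies between min(r) and mean(r)
    return -temperature * np.log(np.mean(np.exp(-r / temperature)))
```

The temperature controls the conservatism: low temperatures recover the hard maximin criterion, while high temperatures approach the plain average of the objectives.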
Optical coherence tomography angiography (OCTA) can non-invasively image the eye's circulatory system. In order to reliably characterize the retinal vasculature, there is a need to automatically extract quantitative metrics from these images. The calculation of such biomarkers requires a precise semantic segmentation of the blood vessels. However, deep-learning-based segmentation methods mostly rely on supervised training with voxel-level annotations, which are costly to obtain. In this work, we present a pipeline to synthesize large amounts of realistic OCTA images with intrinsically matching ground-truth labels, thereby obviating the need for manual annotation of training data. Our proposed method is based on two novel components: 1) a physiology-based simulation that models the various retinal vascular plexuses, and 2) a suite of physics-based image augmentations that emulate the OCTA image acquisition process, including typical artifacts. In extensive benchmarking experiments, we demonstrate the utility of our synthetic data by successfully training retinal vessel segmentation algorithms. Encouraged by our method's competitive quantitative and superior qualitative performance, we believe that it constitutes a versatile tool to advance the quantitative analysis of OCTA images.
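The physics-based augmentation idea can be caricatured in a few lines; the paper's actual suite is far more elaborate, and the artifact model, parameters, and function name below are purely illustrative assumptions:

```python
import numpy as np

def simulate_octa_artifacts(image, rng=None, speckle_sigma=0.2,
                            n_motion_lines=3):
    """Toy sketch of physics-based augmentation for synthetic OCTA:
    add multiplicative speckle noise (characteristic of coherent
    imaging) and a few brightened scan lines mimicking motion
    artifacts in en-face acquisition. Intensities are kept in [0, 1]."""
    rng = np.random.default_rng(rng)
    img = np.asarray(image, dtype=float)
    # multiplicative speckle noise
    img = img * (1.0 + speckle_sigma * rng.standard_normal(img.shape))
    # motion artifacts: whole scan lines brightened
    for row in rng.choice(img.shape[0], size=n_motion_lines, replace=False):
        img[row] = np.clip(img[row] * 1.5 + 0.2, 0.0, 1.0)
    return np.clip(img, 0.0, 1.0)
```

Applying such corruptions to clean simulated vessel maps is what lets a segmentation network trained on synthetic data tolerate the artifacts present in real scans.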
An appropriate ethical framework around the use of artificial intelligence (AI) in healthcare has become a key desideratum as the technology is increasingly widely deployed. Advances in AI hold the promise of improving the precision of outcome prediction at the level of the individual. However, as with any complex human interaction, the addition of these technologies to patient-clinician interactions has potential pitfalls. While physicians have always had to carefully consider the ethical background and implications of their actions, detailed deliberation may not have kept pace. We use a common but major challenge in healthcare interactions, the disclosure of bad news (of probable impending death), to illustrate how the philosophical framework of the "felicific calculus", developed in the 18th century by Jeremy Bentham, has timely applicability in the age of quasi-quantitative AI. We show how this ethical algorithm can be used to assess, across seven mutually exclusive and exhaustive domains, whether an AI-supported action can be morally justified.
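Bentham's felicific calculus scores a candidate action along seven dimensions: intensity, duration, certainty, propinquity (nearness), fecundity, purity, and extent. Whether these map one-to-one onto the paper's seven domains is an assumption here, and the scoring scale is purely illustrative:

```python
# Bentham's seven dimensions of the felicific calculus
DIMENSIONS = ("intensity", "duration", "certainty", "propinquity",
              "fecundity", "purity", "extent")

def felicific_score(assessment):
    """Toy quasi-quantitative scoring of a candidate action.
    `assessment` maps each dimension to a value in [-1, 1], where
    positive values favor the action (e.g. disclosure) and negative
    values oppose it. Returns the mean score; a positive result
    suggests the action can be justified on these terms."""
    missing = set(DIMENSIONS) - set(assessment)
    if missing:
        raise ValueError(f"unassessed dimensions: {sorted(missing)}")
    return sum(assessment[d] for d in DIMENSIONS) / len(DIMENSIONS)
```

Requiring every dimension to be assessed before a score is produced mirrors the "mutually exclusive and exhaustive" structure the abstract describes.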
Transfer learning increasingly becomes an important tool in handling the data scarcity often encountered in machine learning. In the application of high-throughput thickness characterization as a downstream process of the high-throughput optimization of optoelectronic thin films with autonomous workflows, data scarcity occurs especially for new materials. To achieve high-throughput thickness characterization, we propose a machine learning model called thicknessML that predicts thickness from UV-Vis spectrophotometry input, together with an overarching transfer learning workflow. We demonstrate the transfer learning workflow from a generic source domain of band-gapped materials to a specific target domain of perovskite materials, where the target domain data come only from a limited number (18) of refractive indices from the literature. The target domain can be easily extended to other material classes with a few literature data. Defining thickness prediction accuracy as within-10% deviation, thicknessML achieves 92.2% accuracy (with a deviation of 3.6%) with transfer learning, compared to 81.8% accuracy (with a deviation of 11.7%) without it (lower mean and larger standard deviation). Experimental validation on six deposited perovskite films also corroborates the efficacy of the proposed workflow by yielding a 10.5% mean absolute percentage error (MAPE).
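The accuracy criterion in the abstract is simple to state precisely. A sketch of the metric (function name assumed, not part of thicknessML):

```python
import numpy as np

def within_deviation_accuracy(y_true, y_pred, tol=0.10):
    """Accuracy under the abstract's criterion: a thickness prediction
    counts as correct when its relative deviation from the true value
    is below `tol` (10%). Returns (accuracy, MAPE in percent)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(rel < tol)), float(100 * np.mean(rel))
```

Reporting both numbers together, as the abstract does, is useful because a model can pass the 10% threshold often while still carrying a sizable mean relative error, or vice versa.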
Language models demonstrate both quantitative improvements and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to billions of parameters. In addition, a team of human expert raters performed all tasks to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Visualizing optimization landscapes has led to many fundamental insights in numerical optimization and to novel improvements of optimization techniques. However, visualizations of the objective that reinforcement learning optimizes (the "reward surface") have only been generated for a small number of narrow contexts. This work presents, for the first time, reward surfaces and related visualizations of 27 of the most widely used reinforcement learning environments. We also explore reward surfaces in the policy-gradient direction and show, for the first time, that many popular reinforcement learning environments frequently exhibit "cliffs" (sudden large drops in expected return). We demonstrate that A2C often "dives off" these cliffs into low-reward regions of the parameter space, while PPO avoids them, confirming a popular intuition for PPO's improved performance over prior methods. We additionally introduce a highly extensible library that allows researchers to easily generate these visualizations in the future. Our findings provide new intuition to explain the successes and failures of modern RL methods, and our visualizations concretely characterize several failure modes of reinforcement learning agents in novel ways.
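A reward surface is typically generated by evaluating expected return on a grid of parameter perturbations along two random directions. A sketch with a stand-in evaluation function (the paper's library performs real policy rollouts; names and defaults here are assumptions):

```python
import numpy as np

def reward_surface(theta, expected_return, rng=None,
                   half_width=1.0, resolution=11):
    """Sketch of reward-surface generation: sample two random unit
    directions d1, d2 in parameter space and evaluate
    expected_return(theta + a*d1 + b*d2) on a 2-D grid of (a, b).
    `expected_return` is any callable standing in for policy
    evaluation via rollouts."""
    rng = np.random.default_rng(rng)
    d1 = rng.standard_normal(theta.shape)
    d2 = rng.standard_normal(theta.shape)
    d1 /= np.linalg.norm(d1)
    d2 /= np.linalg.norm(d2)
    alphas = np.linspace(-half_width, half_width, resolution)
    surface = np.array([[expected_return(theta + a * d1 + b * d2)
                         for b in alphas] for a in alphas])
    return alphas, surface  # e.g. plot with plt.contourf(alphas, alphas, surface)
```

A "cliff" in the abstract's sense would show up as a sudden large drop in this grid along the policy-gradient direction rather than a random one.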